[tmva][sofie] Restructure emitted code to be differentiable with Clad#18332
guitargeek merged 2 commits into root-project:master
Conversation
Test Results: 22 files, 22 suites, 3d 3h 49m 52s ⏱️. Results for commit aa52030. ♻️ This comment has been updated with latest results.
Force-pushed from 6b90cb6 to 87597cd.
Proof of concept test for this PR

Take this ONNX file (remove the `.txt` suffix): VRlL_real_500k_evts_model.onnx.txt

Here are the scripts to convert the model to C++ and then to differentiate it with Clad:

```cpp
// onnx_to_cpp.C
void onnx_to_cpp()
{
   using namespace TMVA::Experimental;

   SOFIE::RModelParser_ONNX parser;
   SOFIE::RModel model = parser.Parse("./VRlL_real_500k_evts_model.onnx");
   model.SetOptimizationLevel(SOFIE::OptimizationLevel::kBasic);
   model.Generate();
   model.PrintRequiredInputTensors();
   model.OutputGenerated("./VRlL_real_500k_evts_model.hxx");
}
```

```cpp
// sofie_ad.C
#include "VRlL_real_500k_evts_model.hxx"
#include <Math/CladDerivator.h>

using Sess = TMVA_SOFIE_VRlL_real_500k_evts_model::Session;

// Wrapper functions for Clad
float my_func(Sess const &session, float const *tensor_x, float *tensor_theory_params)
{
   float out = 0.;
   TMVA_SOFIE_VRlL_real_500k_evts_model::doInfer(session, tensor_x, tensor_theory_params, &out);
   return out;
}

float my_func_wrapper(Sess const &session, float const *tensor_x, float *tensor_theory_params)
{
   return my_func(session, tensor_x, tensor_theory_params);
}

void sofie_ad()
{
   // Let's go Clad!
   clad::gradient(my_func_wrapper, "tensor_theory_params");

   // Get a function pointer to the pullback. If you are unsure what the
   // signature is, try to cast the pullback to some function pointer, like
   // static_cast<void (*)(float)>(my_func_pullback) in the interpreter, and
   // the compiler will tell you what the real signature is.
   using Grad_t = void (*)(const Sess &, const float *, float *, float, Sess *, float *);

   // Get the function from the interpreter (remove the semicolon to get the code printed)
   auto grad = reinterpret_cast<Grad_t>(gInterpreter->ProcessLine("my_func_pullback;"));

   std::vector<float> input1{5.0, 2.0, 1.0, -1.0, 1.0};
   std::vector<float> input2{0.0};

   // A trick: pre-allocate the session struct both for the forward pass and
   // the backward pass, so that no memory allocation of intermediate tensors
   // has to happen in the gradient.
   TMVA_SOFIE_VRlL_real_500k_evts_model::Session s("VRlL_real_500k_evts_model.dat");
   TMVA_SOFIE_VRlL_real_500k_evts_model::Session d_s("VRlL_real_500k_evts_model.dat");

   // Calculate the numerical gradient
   auto numDiff = [&](int i) {
      const float eps = 1e-4;
      std::vector<float> p{input2};
      p[i] = input2[i] - eps;
      float funcValDown = my_func(s, input1.data(), p.data());
      p[i] = input2[i] + eps;
      float funcValUp = my_func(s, input1.data(), p.data());
      return (funcValUp - funcValDown) / (2 * eps);
   };

   for (std::size_t i = 0; i < input2.size(); ++i) {
      std::cout << i << ":" << std::endl;
      std::cout << "   numr : " << numDiff(i) << std::endl;
   }

   // Calculate the gradient with Clad
   float grad_output[]{0., 0., 0., 0., 0.};
   grad(s, input1.data(), input2.data(), 1.0, &d_s, grad_output);
   std::cout << "   clad : " << grad_output[0] << std::endl;
}
```

Usage with expected output (replace …)
Force-pushed from 89b638c to a3d545f.
Force-pushed from 3f40542 to 78fcc20.
Force-pushed from 4c9920f to 97903fa.
Why did we decide to not pursue this?
@vgvassilev, sorry that was totally an accident. Maybe I confused it with another PR, or I wanted to close and re-open the PR to run the tests, but apparently I missed the "reopen" button.
Force-pushed from 4084f5b to d1dfa3f.
Force-pushed from 0434fc8 to 02aa923.
Force-pushed from d99553f to aa52030.
Is there any way we can easily compare the performance and memory footprint against, say, PyTorch?

Yes, I'm working on that. So far, the generated gradient is not competitive because it's not optimized. I'll have to throw a few more … But these optimizations can be better done in a separate PR. I also need to follow up with a …
lmoneta left a comment:

LGTM!
Thank you Jonas for this very useful addition to SOFIE.
Apart from a minor thing, I have just one comment on the test: whether it is better to keep the diff test separate from the others.
…n Session ctor"

This reverts commit 1f747b0. The reason for the revert is that it's actually useful to have the maximum dynamic tensor size as a data member of the Session, because then we can refactor the generated code such that it can be differentiated with Clad.
The idea of this commit is to refactor the `doInfer()` function that implements the inference from a member function of the `Session` struct to a free function that takes the `Session` by `const`-reference. This free function should only use the session struct and bare C-style arrays, so that Clad will have no problem differentiating it. A unit test for the differentiation of a simple MLP is implemented, embedded in the existing SOFIE tests.
CI failure is unrelated. The SOFIE Keras parser tests fail in all PRs on …
The idea of this commit is to refactor the `doInfer()` function that implements the inference from a member function of the `Session` struct to a free function that takes the `Session` by `const`-reference. This free function should only use the session struct and bare C-style arrays, so that Clad will have no problem differentiating it.

A unit test for the differentiation of a simple MLP is implemented, embedded in the existing SOFIE tests.

For illustration of the changes, here is how the layout of the code emitted for the `Linear_16` unit test looks before and after this PR:

Before:

After:

One side-benefit of this refactor is that users now have a generated inference function that doesn't require the output to be allocated in a `std::vector`, but just takes a C-style output array. The existing `Session::infer()` signature is unchanged for full backwards compatibility.